## The University of Alabama in Huntsville ECE Department CPE 221 01 Fall 2019 Homework #6 Solution

**7.97** A processor executes an instruction in the following six stages. The time required by each stage is given in picoseconds (1,000 ps = 1 ns).

| Stage | Description               | Time   |
|-------|---------------------------|--------|
| IF | Instruction fetch         | 250 ps |
| ID | Instruction decode        | 120 ps |
| OF | Operand fetch             | 220 ps |
| OE | Execute                   | 300 ps |
| M  | Memory access             | 500 ps |
| OS | Operand store (writeback) | 180 ps |

a. What is the time to execute an instruction if the processor is not pipelined?

Time = sum of all stages = (250 + 120 + 220 + 300 + 500 + 180) ps = 1570 ps = 1.57 ns

b. What is the time taken to fully execute an instruction assuming that this structure is pipelined in six stages and that there is an additional 15 ps per stage due to the pipeline latches?

Time = number of stages \* (max (IF, ID, OF, OE, M, OS) + 15 ps) = 6 \* (500 + 15) ps = 3090 ps = 3.09 ns

c. Once the pipeline is full, what is the average instruction execution time?

Once the pipeline is full, one instruction completes every cycle, so the average instruction execution time is 515 ps.

d. Suppose that 30% of instructions are branch instructions that are taken and cause a 4-cycle penalty. What is the effective instruction execution time?

```
0.7 (non-branch) * 515 ps + 0.3 (taken branch) * (1 + 4) * 515 ps = 360.5 ps + 772.5 ps = 1133 ps = 1.133 ns
```
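The arithmetic in parts a through d can be checked with a short script. This is a minimal sketch: the stage times, latch overhead, branch fraction, and penalty come from the problem statement, while the function and constant names are illustrative choices and not part of the original solution. The branch fraction is kept as an integer percentage so the result is not affected by floating-point rounding.

```python
# Stage latencies in picoseconds, copied from the table in problem 7.97.
STAGE_PS = {"IF": 250, "ID": 120, "OF": 220, "OE": 300, "M": 500, "OS": 180}
LATCH_PS = 15          # per-stage latch overhead (part b)
TAKEN_PERCENT = 30     # percentage of instructions that are taken branches (part d)
BRANCH_PENALTY = 4     # extra cycles lost per taken branch (part d)


def unpipelined_ps(stages):
    """Part a: without pipelining, an instruction passes through every stage in turn."""
    return sum(stages.values())


def cycle_ps(stages, latch):
    """Part c: the clock period is set by the slowest stage plus the latch overhead."""
    return max(stages.values()) + latch


def pipelined_latency_ps(stages, latch):
    """Part b: one instruction needs one clock period per stage."""
    return len(stages) * cycle_ps(stages, latch)


def effective_ps(stages, latch, taken_percent, penalty):
    """Part d: average cycles per instruction times the clock period."""
    # Non-branches cost 1 cycle; taken branches cost 1 + penalty cycles.
    cycles_times_100 = (100 - taken_percent) * 1 + taken_percent * (1 + penalty)
    return cycle_ps(stages, latch) * cycles_times_100 / 100


if __name__ == "__main__":
    print("a:", unpipelined_ps(STAGE_PS), "ps")
    print("b:", pipelined_latency_ps(STAGE_PS, LATCH_PS), "ps")
    print("c:", cycle_ps(STAGE_PS, LATCH_PS), "ps")
    print("d:", effective_ps(STAGE_PS, LATCH_PS, TAKEN_PERCENT, BRANCH_PENALTY), "ps")
```

Changing `TAKEN_PERCENT` or `BRANCH_PENALTY` shows how quickly taken branches erode the benefit of the 515 ps cycle.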

**7.98** A RISC processor executes the following code. There are no data dependencies.

```
ADD r0, r1, r2
ADD r3, r4, r5
ADD r6, r7, r8
ADD r9, r10, r11
ADD r12, r13, r14
ADD r15, r16, r17
```

a. Assuming a four-stage pipeline (fetch (IF), operand fetch (OF), execute (E), and write (W)), what registers are being read during the fourth clock cycle, and what register is being written?

| Cycle             | 1  | 2  | 3  | 4  | 5  | 6  | 7  | 8 | 9 |
|-------------------|----|----|----|----|----|----|----|---|---|
| ADD r0, r1, r2    | IF | OF | E  | W  |    |    |    |   |   |
| ADD r3, r4, r5    |    | IF | OF | E  | W  |    |    |   |   |
| ADD r6, r7, r8    |    |    | IF | OF | E  | W  |    |   |   |
| ADD r9, r10, r11  |    |    |    | IF | OF | E  | W  |   |   |
| ADD r12, r13, r14 |    |    |    |    | IF | OF | E  | W |   |
| ADD r15, r16, r17 |    |    |    |    |    | IF | OF | E | W |

Reading r7 and r8 (OF stage); writing r0 (W stage).

b. Assuming a five-stage pipeline (fetch (IF), operand fetch (OF), execute (E), memory (M), and write register (W)), what registers are being read in the seventh clock cycle, and what register is being written?

| Cycle             | 1  | 2  | 3  | 4  | 5  | 6  | 7  | 8 | 9 | 10 |
|-------------------|----|----|----|----|----|----|----|---|---|----|
| ADD r0, r1, r2    | IF | OF | E  | M  | W  |    |    |   |   |    |
| ADD r3, r4, r5    |    | IF | OF | E  | M  | W  |    |   |   |    |
| ADD r6, r7, r8    |    |    | IF | OF | E  | M  | W  |   |   |    |
| ADD r9, r10, r11  |    |    |    | IF | OF | E  | M  | W |   |    |
| ADD r12, r13, r14 |    |    |    |    | IF | OF | E  | M | W |    |
| ADD r15, r16, r17 |    |    |    |    |    | IF | OF | E | M | W  |

Reading r16 and r17 (OF stage); writing r6 (W stage).
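Because there are no data dependencies, instruction i (counting from 0) is simply in stage number (cycle − 1 − i) of the pipeline, so both answers can be read off mechanically. The following is a minimal sketch of that lookup; the names `stage_at`, `reads_and_writes`, and `PROGRAM` are illustrative and do not come from the textbook.

```python
# Each entry is (destination register, source registers) for one ADD in 7.98.
PROGRAM = [
    ("r0", ("r1", "r2")),
    ("r3", ("r4", "r5")),
    ("r6", ("r7", "r8")),
    ("r9", ("r10", "r11")),
    ("r12", ("r13", "r14")),
    ("r15", ("r16", "r17")),
]


def stage_at(instr_index, cycle, stages):
    """Stage occupied by the 0-based instruction in the 1-based cycle, or None.

    Assumes an ideal pipeline: one instruction issues per cycle, no stalls.
    """
    position = cycle - 1 - instr_index
    if 0 <= position < len(stages):
        return stages[position]
    return None


def reads_and_writes(cycle, stages, program, read_stage="OF", write_stage="W"):
    """Return (registers read, registers written) during the given cycle."""
    reads, writes = [], []
    for index, (dst, srcs) in enumerate(program):
        stage = stage_at(index, cycle, stages)
        if stage == read_stage:
            reads.extend(srcs)
        elif stage == write_stage:
            writes.append(dst)
    return reads, writes


if __name__ == "__main__":
    # Part a: four-stage pipeline, fourth cycle.
    print(reads_and_writes(4, ["IF", "OF", "E", "W"], PROGRAM))
    # Part b: five-stage pipeline, seventh cycle.
    print(reads_and_writes(7, ["IF", "OF", "E", "M", "W"], PROGRAM))
```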

**7.99** Consider the following code:

```
LDR r1, [r6]       ; Load r1 from memory. r6 is a pointer
ADD r1, r1, #1     ; Increment r1 by 1
LDR r2, [r6, #4]   ; Load r2 from memory
ADD r2, r2, #1     ; Increment r2 by 1
ADD r3, r1, r2     ; Add r1 and r2 with total in r3
ADD r8, r8, #4     ; Increment r8 by 4
STR r2, [r6, #8]   ; Store r2 in memory
SUB r2, r2, #64    ; Subtract 64 from r2
```

The processor has a six-stage pipeline F D O E M S; that is, instruction fetch, instruction decode, operand fetch, execute, memory access, and operand writeback to register file. Assume that the register file is not capable of writing and reading in the same cycle.

a. How many cycles does this code take to execute assuming internal forwarding is not used?

| Cycle            | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 |
|------------------|---|---|---|---|---|---|---|---|---|----|----|----|----|----|----|----|----|----|----|----|----|----|
| LDR r1, [r6]     | F | D | O | E | M | S |   |   |   |    |    |    |    |    |    |    |    |    |    |    |    |    |
| ADD r1, r1, #1   |   | F | D | O | O | O | O | E | M | S  |    |    |    |    |    |    |    |    |    |    |    |    |
| LDR r2, [r6, #4] |   |   | F | D | D | D | D | O | E | M  | S  |    |    |    |    |    |    |    |    |    |    |    |
| ADD r2, r2, #1   |   |   |   |   |   |   | F | D | O | O  | O  | O  | E  | M  | S  |    |    |    |    |    |    |    |
| ADD r3, r1, r2   |   |   |   |   |   |   |   | F | D | D  | D  | D  | O  | O  | O  | O  | E  | M  | S  |    |    |    |
| ADD r8, r8, #4   |   |   |   |   |   |   |   |   | F | F  | F  | F  | D  | D  | D  | D  | O  | E  | M  | S  |    |    |
| STR r2, [r6, #8] |   |   |   |   |   |   |   |   |   |    |    |    | F  | F  | F  | F  | D  | O  | E  | M  | S  |    |
| SUB r2, r2, #64  |   |   |   |   |   |   |   |   |   |    |    |    |    |    |    |    | F  | D  | O  | E  | M  | S  |

22 cycles
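The stall pattern in the table follows from one rule: an instruction cannot complete its operand fetch (O) until the cycle after the producer of each source register has finished its writeback (S). A minimal sketch of that rule is given below, assuming in-order issue with at most one instruction completing O per cycle. The function name and the `(destination, sources)` encoding are illustrative assumptions, not part of the textbook solution.

```python
# (destination register or None, source registers) for each instruction in 7.99.
PROGRAM = [
    ("r1", ["r6"]),        # LDR r1, [r6]
    ("r1", ["r1"]),        # ADD r1, r1, #1
    ("r2", ["r6"]),        # LDR r2, [r6, #4]
    ("r2", ["r2"]),        # ADD r2, r2, #1
    ("r3", ["r1", "r2"]),  # ADD r3, r1, r2
    ("r8", ["r8"]),        # ADD r8, r8, #4
    (None, ["r2", "r6"]),  # STR r2, [r6, #8]
    ("r2", ["r2"]),        # SUB r2, r2, #64
]


def cycles_without_forwarding(program):
    """Total cycles for a six-stage F D O E M S pipeline with no forwarding.

    Assumption: a register written in the S stage of cycle c can first be
    read by an O stage in cycle c + 1 (no same-cycle write and read).
    """
    write_cycle = {}  # register -> cycle of the latest S stage that writes it
    prev_o = 0        # cycle in which the previous instruction completed O
    last_s = 0
    for i, (dst, srcs) in enumerate(program, start=1):
        # Earliest possible O: F in cycle i, D in i + 1, O in i + 2,
        # and never before the previous instruction has left O.
        o = max(i + 2, prev_o + 1)
        for reg in srcs:
            if reg in write_cycle:
                o = max(o, write_cycle[reg] + 1)
        s = o + 3  # E, M, S follow O without further stalls
        if dst is not None:
            write_cycle[dst] = s
        prev_o = o
        last_s = s
    return last_s


if __name__ == "__main__":
    print(cycles_without_forwarding(PROGRAM), "cycles")
```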

b. How many cycles does the code take to execute assuming internal forwarding is used?

| Cycle            | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
|------------------|---|---|---|---|---|---|---|---|---|----|----|----|----|----|----|----|
| LDR r1, [r6]     | F | D | O | E | M | S |   |   |   |    |    |    |    |    |    |    |
| ADD r1, r1, #1   |   | F | D | O | O | E | M | S |   |    |    |    |    |    |    |    |
| LDR r2, [r6, #4] |   |   | F | D | D | O | E | M | S |    |    |    |    |    |    |    |
| ADD r2, r2, #1   |   |   |   | F | F | D | D | O | E | M  | S  |    |    |    |    |    |
| ADD r3, r1, r2   |   |   |   |   |   | F | F | D | O | E  | M  | S  |    |    |    |    |
| ADD r8, r8, #4   |   |   |   |   |   |   |   | F | D | O  | E  | M  | S  |    |    |    |
| STR r2, [r6, #8] |   |   |   |   |   |   |   |   | F | D  | O  | O  | E  | M  | S  |    |
| SUB r2, r2, #64  |   |   |   |   |   |   |   |   |   | F  | D  | D  | O  | E  | M  | S  |

16 cycles
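The 16-cycle result can be reproduced with a small scheduling model. This is a sketch under assumptions chosen to match the table, not a rule stated in the problem: an ALU result can be forwarded to an E stage in the cycle after the producer's E, a loaded value in the cycle after the producer's M, and an operand that must come from the register file cannot be read in the same cycle the register is written. The last assumption is what accounts for the extra O cycle of `STR r2, [r6, #8]`, whose operand fetch in cycle 11 coincides with the writeback of `ADD r2, r2, #1`. All names are illustrative.

```python
# (opcode, destination register or None, source registers) for the code in 7.99.
PROGRAM = [
    ("LDR", "r1", ["r6"]),
    ("ADD", "r1", ["r1"]),
    ("LDR", "r2", ["r6"]),
    ("ADD", "r2", ["r2"]),
    ("ADD", "r3", ["r1", "r2"]),
    ("ADD", "r8", ["r8"]),
    ("STR", None, ["r2", "r6"]),
    ("SUB", "r2", ["r2"]),
]


def cycles_with_forwarding(program):
    """Total cycles for a six-stage F D O E M S pipeline with forwarding."""
    ready = {}    # register -> cycle at the end of which its value can be forwarded
    written = {}  # register -> cycle of the S stage that writes it
    prev_e = 0    # cycle in which the previous instruction executed
    last_s = 0
    for i, (op, dst, srcs) in enumerate(program, start=1):
        # Earliest possible E: F in cycle i, D, O, then E in cycle i + 3,
        # and never before the previous instruction has left E.
        e = max(i + 3, prev_e + 1)
        for reg in srcs:
            if reg in ready:
                e = max(e, ready[reg] + 1)
        # The final O cycle is e - 1. If a source register is being written
        # in that same cycle, the read is delayed by one cycle (assumption).
        while any(written.get(reg) == e - 1 for reg in srcs):
            e += 1
        if dst is not None:
            # Loads produce their value in M (cycle e + 1), ALU ops in E.
            ready[dst] = e + 1 if op == "LDR" else e
            written[dst] = e + 2
        prev_e = e
        last_s = e + 2  # M and S follow E without further stalls
    return last_s


if __name__ == "__main__":
    print(cycles_with_forwarding(PROGRAM), "cycles")
```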